In this project, I use SQL to explore COVID-19 data and generate insights. Here is a step-by-step walkthrough of the project.
I started by importing the two sheets of data for this project.
I then reviewed the data as a whole to understand it.
In the first analysis, I examined the relationship between total reported cases and total deaths due to COVID-19 in various countries. Comparing these two metrics gives the case fatality rate (CFR): the share of reported cases that ended in death, which approximates the likelihood of dying if one contracts COVID-19. This measure is critical for understanding the severity of the disease in different regions.
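A minimal sketch of the CFR calculation, run through Python's built-in sqlite3 module so that it is self-contained. The table name `covid_deaths`, its columns, and the sample rows are assumptions made for illustration, not the project's actual schema or data.

```python
import sqlite3

# Hypothetical table and column names; the figures are made-up sample values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_deaths "
    "(location TEXT, date TEXT, total_cases INTEGER, total_deaths INTEGER)"
)
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?, ?)",
    [
        ("Aland", "2021-01-01", 1000, 20),
        ("Aland", "2021-01-02", 2000, 50),
        ("Borduria", "2021-01-01", 0, 0),
    ],
)

# CFR as a percentage: deaths divided by reported cases.
# Multiplying by 100.0 forces floating-point division, and NULLIF turns a
# zero case count into NULL instead of a division by zero.
cfr_query = """
SELECT location, date, total_cases, total_deaths,
       total_deaths * 100.0 / NULLIF(total_cases, 0) AS case_fatality_pct
FROM covid_deaths
ORDER BY location, date
"""
cfr_rows = conn.execute(cfr_query).fetchall()
for row in cfr_rows:
    print(row)
```

The row with zero reported cases comes back with a NULL (Python `None`) rate rather than a misleading 0%.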
I then looked at total cases versus population. This analysis measures the proportion of the population reported as infected with COVID-19 by comparing total reported cases to the total population of each country. This percentage gives insight into the spread and prevalence of the virus within different regions.
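The cases-to-population comparison can be sketched as follows. As before, the `covid_deaths` table, its column names, and the numbers are hypothetical stand-ins for the real data.

```python
import sqlite3

# Hypothetical schema and made-up sample values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_deaths "
    "(location TEXT, date TEXT, population INTEGER, total_cases INTEGER)"
)
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?, ?)",
    [
        ("Aland", "2021-01-01", 1000000, 10000),
        ("Aland", "2021-01-02", 1000000, 25000),
        ("Borduria", "2021-01-01", 500000, 5000),
    ],
)

# Percentage of the population with a reported infection on each date.
infected_query = """
SELECT location, date, population, total_cases,
       total_cases * 100.0 / population AS pct_population_infected
FROM covid_deaths
ORDER BY location, date
"""
infected_rows = conn.execute(infected_query).fetchall()
for row in infected_rows:
    print(row)
```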
I then identified the countries with the highest rates of COVID-19 infection relative to their population size. This highlights the regions where the virus has spread most extensively in proportion to the number of inhabitants.
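One way to rank countries by infection rate is to take each country's peak cumulative case count and divide it by population. The sketch below assumes a hypothetical `covid_deaths` table with cumulative `total_cases`; the names and values are illustrative only.

```python
import sqlite3

# Hypothetical schema and made-up sample values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_deaths "
    "(location TEXT, date TEXT, population INTEGER, total_cases INTEGER)"
)
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?, ?)",
    [
        ("Aland", "2021-01-01", 1000000, 10000),
        ("Aland", "2021-01-02", 1000000, 25000),
        ("Borduria", "2021-01-01", 500000, 5000),
        ("Borduria", "2021-01-02", 500000, 20000),
    ],
)

# One row per country: the highest cumulative case count and the share of
# the population it represents, with the most affected country first.
ranking_query = """
SELECT location, population,
       MAX(total_cases) AS highest_case_count,
       MAX(total_cases) * 100.0 / population AS pct_population_infected
FROM covid_deaths
GROUP BY location, population
ORDER BY pct_population_infected DESC
"""
ranking_rows = conn.execute(ranking_query).fetchall()
for row in ranking_rows:
    print(row)
```

Ranking by percentage rather than raw counts keeps small countries with intense outbreaks from being hidden behind large countries with more cases in absolute terms.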
I then explored where COVID-19 deaths were highest relative to population size, first by country and second by continent. This metric identifies the regions where the mortality impact of the virus has been most severe compared to the number of inhabitants.
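A sketch of the two groupings, again on a hypothetical `covid_deaths` table with invented rows. Expressing the result as deaths per 100,000 inhabitants is a choice made here for readability; a plain percentage would work the same way.

```python
import sqlite3

# Hypothetical schema and made-up sample values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_deaths (continent TEXT, location TEXT, date TEXT, "
    "population INTEGER, total_deaths INTEGER)"
)
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?, ?, ?)",
    [
        ("North", "Aland", "2021-01-01", 1000000, 100),
        ("North", "Aland", "2021-01-02", 1000000, 200),
        ("North", "Borduria", "2021-01-02", 1000000, 400),
        ("South", "Costaguana", "2021-01-02", 500000, 500),
    ],
)

# By country: peak cumulative deaths scaled to 100,000 inhabitants.
by_country_query = """
SELECT location,
       MAX(total_deaths) * 100000.0 / population AS deaths_per_100k
FROM covid_deaths
GROUP BY location, population
ORDER BY deaths_per_100k DESC
"""
by_country = conn.execute(by_country_query).fetchall()

# By continent: sum each country's peak deaths and population first, so
# that no country is counted once per date.
by_continent_query = """
SELECT continent,
       SUM(peak_deaths) * 100000.0 / SUM(population) AS deaths_per_100k
FROM (
    SELECT continent, location, population,
           MAX(total_deaths) AS peak_deaths
    FROM covid_deaths
    GROUP BY continent, location, population
) AS per_country
GROUP BY continent
ORDER BY deaths_per_100k DESC
"""
by_continent = conn.execute(by_continent_query).fetchall()

print(by_country)
print(by_continent)
```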
I then did a global analysis of the data. This analysis provides an overview of daily global trends in the COVID-19 pandemic, focusing on three key metrics: the number of new cases, the number of new deaths, and the daily mortality rate. Together, these metrics show the ongoing impact of the virus on a global scale and reveal patterns in the spread and severity of the disease.
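The daily global figures can be sketched by summing across all countries for each date. The `covid_deaths` table, the `new_cases` and `new_deaths` columns, and the values below are assumptions for illustration.

```python
import sqlite3

# Hypothetical schema and made-up sample values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_deaths "
    "(location TEXT, date TEXT, new_cases INTEGER, new_deaths INTEGER)"
)
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?, ?)",
    [
        ("Aland", "2021-01-01", 100, 2),
        ("Borduria", "2021-01-01", 300, 6),
        ("Aland", "2021-01-02", 200, 10),
        ("Borduria", "2021-01-02", 300, 5),
    ],
)

# One row per date: worldwide new cases, new deaths, and the daily
# mortality rate (new deaths as a percentage of new cases).
global_query = """
SELECT date,
       SUM(new_cases) AS global_new_cases,
       SUM(new_deaths) AS global_new_deaths,
       SUM(new_deaths) * 100.0 / NULLIF(SUM(new_cases), 0) AS daily_death_pct
FROM covid_deaths
GROUP BY date
ORDER BY date
"""
global_rows = conn.execute(global_query).fetchall()
for row in global_rows:
    print(row)
```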
Next, I examined the progress of COVID-19 vaccination efforts by comparing the total population with the number of individuals who have received at least one dose of the vaccine. The percentage of the population that has been at least partially vaccinated gives insight into the reach and effectiveness of vaccination campaigns worldwide. I ran this analysis first with a CTE and then a second time with a temporary table.
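The sketch below shows the same calculation done both ways, with a CTE and with a temporary table. It assumes two hypothetical tables, `covid_deaths` (holding population) and `covid_vaccinations` (holding a cumulative `people_vaccinated` count), joined on location and date; the actual project's tables and columns may differ.

```python
import sqlite3

# Hypothetical schema and made-up sample values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE covid_deaths (location TEXT, date TEXT, population INTEGER);
CREATE TABLE covid_vaccinations
    (location TEXT, date TEXT, people_vaccinated INTEGER);
INSERT INTO covid_deaths VALUES
    ('Aland', '2021-01-01', 1000000),
    ('Aland', '2021-01-02', 1000000);
INSERT INTO covid_vaccinations VALUES
    ('Aland', '2021-01-01', 50000),
    ('Aland', '2021-01-02', 120000);
""")

# Shared join: population from one table, vaccination counts from the other.
joined_select = """
SELECT d.location AS location, d.date AS date,
       d.population AS population,
       v.people_vaccinated AS people_vaccinated
FROM covid_deaths AS d
JOIN covid_vaccinations AS v
  ON v.location = d.location AND v.date = d.date
"""

# Version 1: a CTE that exists only for the duration of this one query.
cte_rows = conn.execute(f"""
WITH pop_vs_vac AS ({joined_select})
SELECT location, date,
       people_vaccinated * 100.0 / population AS pct_vaccinated
FROM pop_vs_vac
ORDER BY location, date
""").fetchall()

# Version 2: a temporary table that persists for the whole connection and
# can be queried repeatedly.
conn.execute(f"CREATE TEMP TABLE pop_vs_vac_temp AS {joined_select}")
temp_rows = conn.execute("""
SELECT location, date,
       people_vaccinated * 100.0 / population AS pct_vaccinated
FROM pop_vs_vac_temp
ORDER BY location, date
""").fetchall()

print(cte_rows)
print(temp_rows)
```

Both versions return the same rows. The CTE is convenient for a one-off query, while the temporary table is useful when the intermediate result is reused across several queries in the same session.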
The final step was to create a view that exposes comprehensive COVID-19 statistics to support detailed visualizations. The view includes key fields such as continent, location, date, population and new population.
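A minimal sketch of creating such a view. The view name `covid_overview` and the underlying `covid_deaths` table are hypothetical, and the sketch covers only the continent, location, date, and population columns; further metrics would be added to the SELECT list in the same way.

```python
import sqlite3

# Hypothetical schema and made-up sample values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_deaths (continent TEXT, location TEXT, date TEXT, "
    "population INTEGER, total_cases INTEGER)"
)
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?, ?, ?)",
    [
        ("North", "Aland", "2021-01-01", 1000000, 10000),
        ("South", "Costaguana", "2021-01-01", 500000, 5000),
    ],
)

# A view saves the query definition, not a copy of the data, so it always
# reflects the current contents of the underlying table.
conn.execute("""
CREATE VIEW covid_overview AS
SELECT continent, location, date, population
FROM covid_deaths
""")

view_rows = conn.execute(
    "SELECT * FROM covid_overview ORDER BY location, date"
).fetchall()
for row in view_rows:
    print(row)
```

A visualization tool can then read from the view like any other table, without repeating the query logic.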
To download and view the full project on GitHub, click here.